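To truly master Git, you must look beneath its user-friendly Porcelain commands to understand the Plumbing: the low-level engine that manages Git's internal object database. This database is a content-addressable filesystem in which every piece of data is stored as an immutable object.
1. Porcelain and Plumbing
Porcelain refers to the high-level commands (for example, git status) designed for human interaction. Plumbing refers to the low-level commands that operate directly on Git's object database and expose Git's true internal representation.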
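2. The Object Database
Git stores its internal objects in the .git/objects directory: blobs, trees, commits, and tags. Branches are often discussed alongside objects, but they are really just references that point to commits.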
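To make the idea of typed objects concrete, here is a minimal Python sketch of how a single loose object can be encoded and decoded. It assumes the standard loose-object format, a header of the form "<type> <size>" followed by a NUL byte and the body, all zlib-compressed. The function names encode_object and decode_object are hypothetical helpers for illustration, not part of Git.

```python
import zlib
from typing import Tuple


def encode_object(obj_type: str, body: bytes) -> bytes:
    """Build the compressed bytes of a loose object: zlib("<type> <size>\\0" + body)."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return zlib.compress(header + body)


def decode_object(raw: bytes) -> Tuple[str, bytes]:
    """Decompress a loose object and split it into (type, body)."""
    data = zlib.decompress(raw)
    # The header ends at the first NUL byte; everything after it is the body.
    header, _, body = data.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


if __name__ == "__main__":
    raw = encode_object("blob", b"hello\n")
    print(decode_object(raw))
```

The same header layout is used for all four object types; only the type word and the structure of the body differ.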
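3. SHA-1 Addressing
Every object is named by a unique 40-character hexadecimal SHA-1 checksum. To keep any single directory from growing too large, Git uses the first two characters as a subdirectory name (for example, af/) and the remaining 38 characters as the filename.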
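The naming scheme can be sketched in a few lines of Python. This is an illustration under the assumption that a blob's ID is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the file's raw bytes. The helper names git_blob_id and loose_object_path are hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Split a 40-character ID into the subdirectory and filename parts."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


if __name__ == "__main__":
    oid = git_blob_id(b"test content\n")
    print(oid)
    print(loose_object_path(oid))
```

If Git is installed, the printed ID can be compared against the output of the plumbing command git hash-object run on a file with the same content.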
QUESTION 1
Which of the following best describes 'Porcelain' commands in Git?
Low-level commands for manual database manipulation.
High-level commands meant for day-to-day user interaction.
A specific type of encrypted SSH key.
The process of garbage collection.
✅ Correct!
Porcelain commands like 'git status' and 'git commit' are the user-friendly interface we use daily.
❌ Incorrect
Low-level commands are known as 'Plumbing'; Porcelain is the high-level layer.
QUESTION 2
Where does Git store its internal object database?
.git/config
.git/objects
.git/refs
.git/hooks
✅ Correct!
The '.git/objects' folder is the content-addressable store for all blobs, trees, and commits.
❌ Incorrect
The '.git/config' file stores settings; objects are stored in '.git/objects'.
QUESTION 3
Which internal Git object represents the content of a single file?
Tree
Blob
Commit
Tag
✅ Correct!
Blobs (Binary Large Objects) store the raw content of files without filenames.
❌ Incorrect
Trees represent directory structures; Blobs represent the file content itself.
QUESTION 4
How does Git create the 40-character identifiers for its objects?
Using a random number generator.
Using SHA-1 checksums of the object's content.
By sequential numbering of commits.
Using the timestamp of the operation.
✅ Correct!
Because each ID is a SHA-1 checksum of the object's content, Git can detect any corruption of that content.
❌ Incorrect
Git uses deterministic hashing (SHA-1), not random or sequential numbers.
QUESTION 5
What happens at the plumbing level when you run 'git status'?
It simply reads a single text file labeled 'status.txt'.
It compares the working directory against the 'Object Web' (blobs and trees).
It deletes all untracked files automatically.
It re-encrypts the entire repository.
✅ Correct!
Git compares the working directory with the index, and the index with the tree and blob objects of the current commit, to determine what has changed.
❌ Incorrect
Git performs a complex plumbing operation behind the scenes, comparing the working directory, the index, and the object database.
Case Study: Investigating the Object Store
Understanding File System Architecture in Git
To understand how Git maintains data integrity and efficiency, we must examine the physical storage on disk. Since every piece of data (blobs, trees, tags, and commits) is immutable, Git must organize these files in a way that remains performant even with thousands of objects. This foundational structure relies on the SHA-1 hashing algorithm to 'name' files.
Q
In your my-git-repo repository, open the folder .git/objects and identify how Git stores objects using SHA-1 checksums. (Word count requirement: 25 words minimum).
Solution:
Git calculates a 40-character SHA-1 hash for each object. It uses the first two characters to create a subdirectory (e.g., /af/) and uses the remaining 38 characters as the filename within that folder. This structure prevents thousands of files from cluttering a single directory, ensuring the file system remains efficient.
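The directory layout described in the solution can be explored with a short Python sketch. It walks an objects directory and rebuilds each 40-character ID from the two-character subdirectory name plus the 38-character filename. The function name list_loose_objects is hypothetical, and the sketch only sees loose objects: entries such as pack and info, which are not two-character hexadecimal directories, are skipped.

```python
import os
import re
import sys
from typing import List


def list_loose_objects(objects_dir: str) -> List[str]:
    """Return the IDs of all loose objects found under objects_dir."""
    ids = []
    for sub in sorted(os.listdir(objects_dir)):
        sub_path = os.path.join(objects_dir, sub)
        # Only two-hex-character directories hold loose objects.
        if not (os.path.isdir(sub_path) and re.fullmatch(r"[0-9a-f]{2}", sub)):
            continue
        for name in sorted(os.listdir(sub_path)):
            if re.fullmatch(r"[0-9a-f]{38}", name):
                # Subdirectory (2 chars) + filename (38 chars) = full 40-char ID.
                ids.append(sub + name)
    return ids


if __name__ == "__main__":
    # Pass a path such as my-git-repo/.git/objects as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else os.path.join(".git", "objects")
    if os.path.isdir(target):
        for object_id in list_loose_objects(target):
            print(object_id)
```

Running it against my-git-repo/.git/objects should print one ID per loose object, which makes the two-plus-38 split easy to verify by eye.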
Q
Why is it impossible to change a file in Git's history without changing its SHA-1 hash?
Solution:
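Because every object's hash is calculated directly from its content. This makes Git a content-addressable filesystem: changing even a single bit in a blob produces a completely different SHA-1 ID. Each tree records the IDs of its blobs, and each commit records the IDs of its tree and parent commits, so the change also alters the ID of every tree and commit that leads to that file, which makes any tampering with history detectable.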
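The effect of a one-bit change can be demonstrated with a small Python sketch. It assumes the standard object format ("<type> <size>" plus a NUL byte, then the body) and a tree body made of entries of the form "<mode> <name>" plus a NUL byte, followed by the raw 20-byte ID of the referenced object. The helper names, the file name main.py, and the sample content are hypothetical.

```python
import hashlib


def object_id(obj_type: bytes, body: bytes) -> bytes:
    """Return the raw 20-byte SHA-1 of "<type> <size>\\0" + body."""
    header = obj_type + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id_for_single_file(filename: bytes, blob_id: bytes) -> bytes:
    """Hash a tree holding one regular file entry: "100644 <name>\\0" + raw blob ID."""
    entry = b"100644 " + filename + b"\0" + blob_id
    return object_id(b"tree", entry)


original = b"print('hello')\n"
# Flip the lowest bit of the first byte to simulate a one-bit change.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

blob_a = object_id(b"blob", original)
blob_b = object_id(b"blob", tampered)
tree_a = tree_id_for_single_file(b"main.py", blob_a)
tree_b = tree_id_for_single_file(b"main.py", blob_b)

print("blob:", blob_a.hex(), "->", blob_b.hex())
print("tree:", tree_a.hex(), "->", tree_b.hex())
```

Both pairs of IDs differ: the blob changes because its content changed, and the tree changes because it embeds the blob's ID. A commit that points at the tree would change for the same reason.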